Adapted from, and with thanks to, the original tutorial by Valliappa Lakshmanan, formerly at Climate Corp and now at Google.
https://eng.climate.com/2015/10/27/how-to-read-and-display-nexrad-on-aws-using-python/
Amazon Simple Storage Service (Amazon S3) is object storage with a simple web service interface to store and retrieve any amount of data from anywhere on the web. It is designed to deliver 99.999999999% durability, and scale past trillions of objects worldwide.
Boto is a Python package that provides interfaces to Amazon Web Services.
In [48]:
# Let's import some stuff!
import boto
from boto.s3.connection import S3Connection
from datetime import timedelta, datetime
import os
import pyart
from matplotlib import pyplot as plt
import tempfile
import numpy as np
%matplotlib inline
From https://aws.amazon.com/noaa-big-data/nexrad/ :
The NEXRAD Level II archive data is hosted in the “noaa-nexrad-level2” Amazon S3 bucket in S3’s US East region. The address for the public bucket is:
http://noaa-nexrad-level2.s3.amazonaws.com
https://noaa-nexrad-level2.s3.amazonaws.com
Each volume scan file is its own object in Amazon S3. The basic data format is the following:
/<Year>/<Month>/<Day>/<NEXRAD Station>/<filename>
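Where <Year>, <Month>, and <Day> give the date the data was collected, <NEXRAD Station> is the four-letter ground station ID, and <filename> is the name of the volume scan file.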
All files in the archive use the same compressed format (.gz). The data file names are, for example, KAKQ20010101_080138.gz. The file naming convention is:
GGGGYYYYMMDD_TTTTTT
Where:
GGGG = Ground station ID (map of ground stations)
YYYY = year
MM = month
DD = day
TTTTTT = time when data started to be collected (GMT)
Note that the 2015 files have an additional field on the file name. It adds “_V06” to the end of the file name. An example is KABX20150303_001050_V06.gz.
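As a quick illustration of the naming convention above, the following minimal sketch pulls the station ID and scan start time out of a file name. The helper name `parse_nexrad_filename` and the regular expression are assumptions made for this example and are not part of boto or Py-ART; the pattern treats a `_V06`-style suffix and the `.gz` extension as optional.

```python
import re
from datetime import datetime

# Hypothetical helper (not part of boto or Py-ART).
# Pattern: GGGGYYYYMMDD_TTTTTT, optionally followed by a version
# field such as _V06 and a .gz extension.
_NAME_RE = re.compile(
    r'^(?P<site>[A-Z0-9]{4})(?P<stamp>\d{8}_\d{6})(?:_V\d{2})?(?:\.gz)?$'
)

def parse_nexrad_filename(filename):
    """Return (site, start_time) for a NEXRAD Level II file name."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError('unrecognised NEXRAD file name: %r' % filename)
    start_time = datetime.strptime(match.group('stamp'), '%Y%m%d_%H%M%S')
    return match.group('site'), start_time

# The two example names quoted in the text above
print(parse_nexrad_filename('KAKQ20010101_080138.gz'))
print(parse_nexrad_filename('KABX20150303_001050_V06.gz'))
```

Anchoring the pattern at both ends means anything that does not follow the convention raises a `ValueError` instead of producing a silently wrong timestamp.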
In [8]:
# First, let's connect to the bucket
conn = S3Connection(anon = True)
bucket = conn.get_bucket('noaa-nexrad-level2')
In [10]:
# As we can see, there is a LOT we can do with a bucket!
dir(bucket)
Out[10]:
The contents of the bucket are available through bucket.list().
In [12]:
my_list = bucket.list()
help(my_list)
As the help output shows, this is an iterator. Printing the whole listing would be YUUUUGE, so we want to subset it, which we can do with the prefix keyword. We then cast the result to a list.
In [15]:
my_pref = '2011/05/20/KVNX/'
bucket_list = list(bucket.list(prefix = my_pref))
In [20]:
print(bucket_list[0:10])
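The prefix is just the leading part of the key layout described earlier, so it can be generated from a datetime rather than typed by hand. A minimal sketch, where `make_prefix` is a hypothetical helper name introduced here for illustration:

```python
from datetime import datetime

def make_prefix(site, when):
    """Build the 'YYYY/MM/DD/SITE/' key prefix for one station and day."""
    # Hypothetical helper: mirrors the <Year>/<Month>/<Day>/<NEXRAD Station>/
    # layout of the bucket. strftime zero-pads the month and day.
    return when.strftime('%Y/%m/%d/') + site + '/'

print(make_prefix('KVNX', datetime(2011, 5, 20)))  # 2011/05/20/KVNX/
```

The resulting string can be passed as the prefix argument to bucket.list, exactly like the hand-written my_pref above.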
So we have a list of keys (objects) in the S3 bucket. We can access an item directly and download it to a file using the key's get_contents_to_filename method.
In [34]:
home_dir = os.path.expanduser('~')
bucket_list[0].get_contents_to_filename(os.path.join(home_dir,'nexrad_tempfile'))
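Because the bucket is public, each key also has a plain HTTPS address made of the bucket address quoted earlier plus the key name. The sketch below only builds that URL and does not touch the network; `key_to_url` is a hypothetical helper, and the example key is assembled from the sample file name in the AWS description rather than taken from a live listing.

```python
from urllib.parse import quote

# Public bucket address as quoted from the AWS NEXRAD page
BUCKET_URL = 'https://noaa-nexrad-level2.s3.amazonaws.com'

def key_to_url(key_name):
    """Return the public HTTPS URL for a key in the NEXRAD bucket."""
    # Hypothetical helper. quote() leaves '/' untouched by default,
    # so the Year/Month/Day/Station structure of the key is preserved.
    return BUCKET_URL + '/' + quote(key_name.lstrip('/'))

print(key_to_url('2015/03/03/KABX/KABX20150303_001050_V06.gz'))
```

A URL built this way could be fetched with `urllib.request.urlretrieve` or pasted into a browser, as an alternative to get_contents_to_filename.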
OK! That was easy. Let's take a quick look.
In [36]:
radar = pyart.io.read(os.path.join(home_dir,'nexrad_tempfile'))
In [37]:
# info() prints its report itself and returns None, so no print() is needed
radar.info()
In [40]:
my_figure = plt.figure(figsize = [10,8])
my_display = pyart.graph.RadarDisplay(radar)
my_display.plot_ppi('reflectivity', 0, vmin = -12, vmax = 64)
OK! How do I search for the volume I want and open it easily in Py-ART? Here is a small, documented script.
In [42]:
# Helper function for the search
def _nearestDate(dates, pivot):
    return min(dates, key=lambda x: abs(x - pivot))

def get_radar_from_aws(site, datetime_t):
    """
    Get the closest volume of NEXRAD data to a particular datetime.

    Parameters
    ----------
    site : string
        four letter radar designation
    datetime_t : datetime
        desired date time

    Returns
    -------
    radar : Py-ART Radar Object
        Radar closest to the queried datetime
    """
    # First create the query string for the bucket knowing
    # how NOAA and AWS store the data
    my_pref = datetime_t.strftime('%Y/%m/%d/') + site
    # Connect to the bucket
    conn = S3Connection(anon = True)
    bucket = conn.get_bucket('noaa-nexrad-level2')
    # Get a list of files
    bucket_list = list(bucket.list(prefix = my_pref))
    # We are going to create a list of keys and datetimes to allow easy searching
    keys = []
    datetimes = []
    # Populate the lists
    for i in range(len(bucket_list)):
        this_str = str(bucket_list[i].key)
        if 'gz' in this_str:
            endme = this_str[-22:-4]
            fmt = '%Y%m%d_%H%M%S_V0'
            dt = datetime.strptime(endme, fmt)
            datetimes.append(dt)
            keys.append(bucket_list[i])
        if this_str[-3::] == 'V06':
            endme = this_str[-19::]
            fmt = '%Y%m%d_%H%M%S_V06'
            dt = datetime.strptime(endme, fmt)
            datetimes.append(dt)
            keys.append(bucket_list[i])
    # Find the closest available radar to your datetime
    closest_datetime = _nearestDate(datetimes, datetime_t)
    index = datetimes.index(closest_datetime)
    localfile = tempfile.NamedTemporaryFile()
    keys[index].get_contents_to_filename(localfile.name)
    radar = pyart.io.read(localfile.name)
    return radar
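The search step in the script boils down to taking the minimum over absolute time differences. The stand-alone sketch below repeats that idea on a few made-up scan times (illustrative values, not taken from the bucket) so it can be tried without a network connection; `nearest_date` mirrors the `_nearestDate` helper above.

```python
from datetime import datetime

def nearest_date(dates, pivot):
    """Return the element of dates closest in time to pivot."""
    # Subtracting datetimes gives a timedelta; abs() makes it unsigned
    return min(dates, key=lambda d: abs(d - pivot))

# Made-up scan start times, for illustration only
scan_times = [
    datetime(2016, 10, 6, 19, 20, 11),
    datetime(2016, 10, 6, 19, 25, 48),
    datetime(2016, 10, 6, 19, 31, 25),
]
target = datetime(2016, 10, 6, 19, 27, 0)

closest = nearest_date(scan_times, target)
# The index is what the script uses to pick the matching key
print(closest, scan_times.index(closest))
```

Note that `min` raises a `ValueError` on an empty list, so a prefix that matches no files would need to be handled before this step.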
Let's take it for a spin!
In [50]:
base_date = "20161006_192700"
fmt = '%Y%m%d_%H%M%S'
b_d = datetime.strptime(base_date, fmt)
my_radar = get_radar_from_aws('KAMX',b_d )
max_lat = 27
min_lat = 24
min_lon = -81
max_lon = -77
lal = np.arange(min_lat, max_lat, .5)
lol = np.arange(min_lon, max_lon, .5)
display = pyart.graph.RadarMapDisplay(my_radar)
fig = plt.figure(figsize = [10,8])
display.plot_ppi_map('reflectivity', sweep = 0, resolution = 'c',
                     vmin = -8, vmax = 64, mask_outside = False,
                     cmap = pyart.graph.cm.NWSRef,
                     min_lat = min_lat, min_lon = min_lon,
                     max_lat = max_lat, max_lon = max_lon,
                     lat_lines = lal, lon_lines = lol)